Cross-lingual studies of ASR errors: paradigms for perceptual evaluations

Authors

  • Ioana Vasilescu
  • Martine Adda-Decker
  • Lori Lamel
Abstract

It is well known that human listeners significantly outperform machines at transcribing speech. This paper presents a progress report on joint research into automatic versus human speech transcription and on the perceptual experiments developed at LIMSI, which aim to increase our understanding of automatic speech recognition errors. Two paradigms are described in which human listeners are asked to transcribe speech segments containing words that the system frequently misrecognizes. In particular, we sought to learn how increased context helps humans disambiguate problematic lexical items, typically homophones or near-homophones. The long-term aim of this research is to improve the modeling of ambiguous contexts so as to reduce automatic transcription errors.


Similar resources

Cross-Lingual Study of ASR Errors: On the Role of the Context in Human Perception of Near-Homophones

It is widely acknowledged that human listeners significantly outperform machines when it comes to transcribing speech. This paper presents a paradigm for perceptual experiments that aims to increase our understanding of human and automatic speech recognition errors. The role of the context length is investigated through perceptual recovery of small homophonic words or near-homophones yielding f...


Design of Cross-lingual and Multilingual Corpora for Speaker Recognition Research and Evaluation in Indian Languages

Automatic Speaker Recognition (ASR) is an economical biometric method because low-cost, powerful processors are widely available. Results of ASR are highly dependent on the database, i.e., the results obtained in an ASR system are meaningless if the recording conditions are not standard. In this paper, a methodology and a typical experimental setup used for development of corpora for ...


Analysis of Mismatched Transcriptions Generated by Humans and Machines for Under-Resourced Languages

When speech data with native transcriptions are scarce in an under-resourced language, automatic speech recognition (ASR) must be trained using other methods. Semi-supervised learning first labels the speech using ASR from other languages, then re-trains the ASR using the generated labels. Mismatched crowdsourcing asks crowd-workers unfamiliar with the language to transcribe it. In this paper, ...


A perceptual investigation of speech transcription errors involving frequent near-homophones in French and American English

This article compares the errors made by automatic speech recognizers to those made by humans for near-homophones in American English and French. This exploratory study focuses on the impact of limited word context and the potential resulting ambiguities for automatic speech recognition (ASR) systems and human listeners. Perceptual experiments using 7-gram chunks centered on incorrect or correc...


Cross-lingual portability of MLP-based tandem features - a case study for English and Hungarian

One promising approach for building ASR systems for less-resourced languages is cross-lingual adaptation. Tandem ASR is particularly well suited to such adaptation, as it includes two cascaded modelling steps: feature extraction using multi-layer perceptrons (MLPs), followed by modelling using a standard HMM. The language-specific tuning can be performed by adjusting the HMM only, leaving the ML...



Publication date: 2012